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Abstract.  The  statements  rf  a program  do  not  always  provide  sufficient 
information  for  proving  its  correctness.  The  correctness  of  the  algorithm  implemented 
by  the  prog'am  must  often  be  proved  with  the  pure  mathematical  techniques  or 
exhaustive  enumeration.  An  example  program  is  presented  for  which  the  correctness 
proof  of  the  program  is  trivial  provided  that  the  correctness  of  the  underlying 
algorithm  can  be  demonstrated.  The  program  can  be  viewed  as  an  abstraction  of  a 
quadratic  hash  algorithm.  It  is  used  at  the  end  of  the  paper  to  encode  the  algorithm 
most  efficiently. 


I 


This  work  was  supported  by  the  Defense  Advanced  Research  Projects  Agency 
of  the  Office  of  the  Secretary  of  Defense  under  contract  F44620-7 3-C-0074 
and  is  monitored  by  the  Air  Force  Office  of  Scientific  Research. 


The  Correctness  Proof  of  a Quadratic-Hash  Algorithm  1 


1.  Proving  the  correctness  of  a program  by  Inductive  assertion  has  become  a generally 
accepted  method.  It  was  introduced  by  R.  W.  Floyd  when  he  applied  it  to  flowchart 
programs  [1}.  The  method  was  considerably  improved  by  C.  A.  R.  Hoare.  He 
expresses  the  meaning  of  program  statements  in  terms  of  a set  of  axioms  [2\  The 
truth  of  P { S }Q,  where  P and  Q are  predicates  and  S a program  statement,  can  be 
derived  from  the  axioms  that  apply  to  S.  The  significance  of  Hoare’s  improvement  of 
Floyd’s  method  lies  In  the  fact  that  now  programming  language  constructs  are  directly 
used  in  the  correctness  proof.  One  of  the  most  practical  axioms  is  the  one  that 
describes  the  while  statement  by  P{S}P  -»  P{while  B do  $}  P A not  B.  The  fact  that  the 
condition  B is  false  if  (and  when)  the  while  statement  terminates  Is  often  used  in  the 
subsequent  part  of  the  program.  Its  usefulness  is  nicely  demonstrated  in  Dijkstra’s 
program  for  computing  the  greatest  common  divisor  of  two  integer  numbers  [3]. 

However,  one  must  not  get  the  impression  that  the  inductive  assertion  method  is 
complete  in  the  sense  that  it  shows  that  the  algorithm,  implemented  by  the  program,  is 
correct.  The  inductive  assertions  can  do  no  more  than  show  that  the  program  is  a 
true  implementation  of  the  algorithm.  That  is  to  say,  the  proof  method  can  show  that 
the  program  will  execute  the  steps  which  the  algorithm  is  supposed  to  take.  If  the 
algorithm  is  wrong,  then  the  program  will  also  be  wrong.  In  addition  to  the 
correctness  proof  of  the  program,  we  also  neeo  to  give  a correctness  proof  of  the 
algorithm.  For  instance,  the  correctness  proof  of  the  algorithm  which  Dijkstra  uses  to 
compute  the  gcd  depends  not  only  o')  the  assertions  between  the  program  statements, 
but  primarily  on  the  mathematical  facts  that  gcd(a,b)  - gcd(b,a),  gcd(a,a)  ■ a and 
gcd(a,b)  - gcd(max(a,b)  - min(a,b),  min(a,b)). 
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All  correctness  proofs  have  in  common  that  long  proofs  are  rarely  convincing. 

This  is  true  for  the  proof  of  a pure  mathematical  theorem  as  much  as  for  the 
correctness  proof  of  a program.  Therefore,  a convincing  proof  cannot  be  given  unless 
the  problem  is  decomposed  in  a coherent  set  of  overseeable  parts.  This  does  not 
necessarily  imply  a top-down  approach  to  wr  ting  programs,  but  it  requires  that 
various  aostractions  of  the  problem  are  considered  sepaiately.  The  abstractions  must 
be  manageable  and  they  must  be  chosen  such  that  the  solution  can  easily  be  composed 
out  of  these.  For  small  problems  a decomposition  based  on  the  successive  steps  of 
the  algorithm  may  be  adequate,  but  Parnas  shows  that  for  large  problems  a 
decomposition  based  on  the  data  structures  and  their  operations  is  much  more  useful 
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Sometimes  the  algorithms  are  simple  and  the  programs  rather  complex  because  of 
pure  size.  An  exampie  of  such  a situation  is  the  syntax  scanner  in  the  compiler  for  a 
programming  language.  The  size  of  the  code  makes  t necessary  to  split  the  program 
into  smaller  parts.  Coming  up  with  an  exhaustive  case  analysis  is  one  of  the  most 
difficult  aspects  of  a correctness  proof  in  this  case.  In  this  paper  we  consider  the 
other  extreme,  where  the  correctness  of  the  algorithm  is  not  obvious,  but  the 
implementation  is  straightforward.  The  example  presented  in  this  paper  serves  two 
purposes.  In  the  first  place  it  shows  that  there  is  more  to  proving  the  correctness  of 
a program  than  applying  the  inductive  assertion  method.  Secondly,  the  resulting 
program  has  a practical  aspect;  it  leads  to  a very  efficient  implementation  of  a 
quadratic  ash  technique. 
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2.  Given  the  program 
begin  integer  c,h,p; 
REACKp);  REACXh); 


for  c 0 step  1 until  <p-l)/2,  (p-l)/2  step  - 1 until  1 do 
begin  h (h+c)7.p;  PRINT(h)  end 

end 

v/i  ere  h is  a non-negative  integer  and  p is  a prime  number  such  that  p - -1  (mod  4). 
(E.g.,  the  numbers  p - 2Tk-l,  where  k is  prime,  satisfy  these  conditions.)  The  symbol  / 
represents  integer  division  and  the  symbol  2 represents  the  remainder  function. 

The  question  is  to  prove  that  this  program  prints  all  the  numbers  of  the  set  S.O  * 

p- 1 } exactly  once,  no  matter  whicf  non-  negative  integer  value  is  assigned  to 

h in  the  statement  REACXh).  In  other  words,  the  program  prints  a permutation  of  the 
numbers  0,l,...,p-l,  independent  of  the  (permissible)  initial  vsiue  of  h. 

The  relevant  assertions  are 
Al:  h is  a number  in  the  range  [0..p-l] 

A2:  h is  not  equal  to  a number  already  printed 

A3:  all  the  printed  numbers  are  different  and  in  the  range  [0..p-l] 

A4:  the  number  of  printed  numbers  is  equal  to  p 

With  the  assertion  between  { },  the  program  reads  like  this: 


'I 
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beg  in  integer  c,h,p; 

READ(p);  {p  is  prime  and  p - -1  (mod  4)} 

REACKh);  {h  >-0} 

{A3} 

for  c 0 step  1 until  <p-l)/2,  (p-l)/2  step  -1  until  1 do 
begin  h (h+c)7.p;  {A1 } 

PRINT(h);  {A2} 

{A3} 

end 

{A3,A4} 

end 


Assertion  A3  must  be  an  invariant  of  the  for  statement;  it  must  be  true  after  every 
iteration.  The  proof  that  all  the  assertions  hold  is  simple  for  Al,  A3  and  A4,  but 
difficult  for  A2.  Assertion  A3  is  true  before  the  iteration  starts,  because  no  numbers 
have  been  printed  yet.  The  for  clause  shows  that  c>«  0 in  every  iteration  and 
initially  h >-  0.  Thus,  (h+c)7p  is  also  greater  than,  cr  equal  to,  zero,  so  Al  is  true. 
We  can  then  show  by  induction  that  Al  is  true  for  each  iteration.  Assuming  that  A2  is 
true,  A3  holds  at  the  end  of  an  iteration.  When  the  iteration  terminates,  A4  holds, 
because  the  for  clause  prescribes  p iterations.  The  combination  of  A3  and  A4  implies 
that  all  the  numbers  of  S.O  ■ {0,1,..., p-1}  have  been  printed  exactly  once.  So,  if 
assertion  A2  is  true,  the  program  is  trivial. 
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3.  The  proof  that  the  program  does  not  print  a number  in  the  range  [0..p-l]  twice 
cannot  be  derived  from  the  program.  But  it  can  be  proved  by  applying  some  number 
theory  and  group  theory. 

Let  S.O  ■ {0,l,2,...,p-l },  S.l  ■ {l»2,„.,p-l},  p is  prime.  Let  the  symbol  * represent 
equality  modulo  p.  The  zero-theorem  says:  if  p is  prime  and  a*b  ■ 0 for  a,b  € S.O, 
then  a ■ 0 or  b ■ 0 (or  both)  [5J.  The  proof  uses  the  fact  that  if  a*b  - m*p,  then  p 
must  be  a divisor  of  a or  of  b,  since  p is  prime. 

We  split  the  iteration  in  two  "parts**.  In  the  first  part  the  iteration  variable  c - 
0,l,...,(p-l)/2  and  in  the  second  part  c - (p-l)/2,  (p-3)/2,...,l.  First  we  prove  that  two 
numbers  h generated  in  the  first  part  are  not  equal,  then  we  prove  the  same  for  two 
numbers  in  the  second  part  and  finally  we  prove  that  a number  h generated  in  the 
first  part  is  not  equal  to  any  number  generated  in  the  second  part. 

a)  Let  h.i  and  h.j  be  two  numbers  generated  in  the  first  part.  Let  R.0  - {0,l,...,(p-l)/2} 
and  let  the  initial  value  assigned  to  h in  the  statement  READ(h)  be  H. 

h.O  ■ H,  h.i  ■ h.(i-l)+i  for  i i 1,  so  h.i  ■ H+i*(i+l)/2  for  i < R.0 

If  i,j(R.O  and  h.i  ■ h.j,  then  i(i+l)  « j(j  + l),  or  (i-j)(i+j+l)  ■ 0.  The  zero  theorem  says 
that  i ■ j or  i+j  i -1.  Since  i,j  (.  [0..(p-l)/2],  i - j or  i - j - (p-l)/2.  Thus,  if  i,j  < R.0 

and  h.i  ■ h.j,  then  i - j.  In  other  words,  all  the  numbers  h generated  in  the  first  part 
are  different. 

b)  Let  h.i  and  h.j  be  two  numbers  generated  in  the  second  part,  where  i,j  < R.l  - 
{l,2,...,(p-l)/2}.  At  the  end  of  the  first  part,  h has  the  value  H + ( p+l)*(p-l)/8.  A 
number  h.i  in  the  second  part  is  equal  to 
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h.i  ■ H + (p-l)*(p+l)/8  ' <p-l)/2  ♦ (p-3)/2  + ...  +(p-(2i-l))/2  ■ H ♦ (p-l)*(p  + l)/8 

♦ i*(p-i)/2 

If  i,j  € R.l  and  h.i  ■ h.j,  then  i(p-i)  ■ j(p-j)  or  (i-jXi+j)  ■ 0.  The  zero  theorem  says 
that  i ■ j or  i ■ -j.  The  latter  equation  has  no  solution  for  i and  j both  in  R.l.  The 
equality  i ■ j implies  for  the  given  interval  i-j.  Thus,  if  h.i  ■ h.j  in  the  second  part, 
then  i“i  In  other  words,  all  the  numbers  generated  in  the  second  part  are  different. 

c)  Let  h.i  be  a number  generated  in  the  first  part  and  h.j  a number  generated  in  the 
second  half.  If  h.i  ■ h.j,  then 

•*(i+l)/2  ■ <p-l)*(p+l)/8  ♦ j*(p-j)/2 
/JiT2  + 4ia  (pT2-l)  ♦ 4pj  - 4jT2 
<2i  + l)t2  + <2j)T2  ■ 0 

We  must  prove  that  the  sum  of  two  squares  cannot  be  equal  to  zero  modulo  p. 
This  is  not  self-evident.  In  fact,  it  is  for  many  prime  numbers  possible  that  xT2+yT2  » 
0.  E.g.  for  p - 13,  (x,y)  - (5,1)  or  (3,2)  or  (11,3),  etc.  and  for  p - 17,  (x,y)  - (1,13) 
or  (2,9)  or  (3,5),  etc.  In  the  next  section  we  prove  that  for  p - -1  (mod  4)  the  sum  of 
two  -quares  is  not  equivalent  to  zero. 

4.  It  is  well-known  that  the  set  S.l  - {l,2,...,p-l}  with  the  operation  "multiplication 
modulo  p"  is  a (commutative)  group  if  p is  a prime  number.  (The  operation  is  clearly 
associative  and  commutative.  There  remains  to  be  shown  that  there  is  for  every  pair 
a,b  e S.l  an  element  x C S.l  such  that  a*x  ■ b.  Assume  that  the  coset  V - aS.l  has 
two  equal  elements:  a*c  ■ a*d,  where  a,c,d  < S.l.  If  so,  a*(c-d)  « 0.  Since  not  a ■ 0,  c 

* d by  the  zero  theorem.  Thus,  V has  p-1  different  numbers.  Since  all  these 
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numbers  are  in  the  range  [l..p-l],  V - S.l.  This  means  there  must  be  an  element  a *y 
i V such  that  a*x  « b.) 

Next  we  show  that  there  is  a pair  x,y  < S.l  such  that  xt2+yt2  ■ 0 is  true  if  and 
only  if  there  is  an  element  t < S.l  such  that  tt2  * -1. 

Proof.  Let  x,y  < S.l  be  a pair  for  which  xT2+yT2  ■ 0 is  true.  There  exists  a t < 

S.l  such  that  tx  ■ y.  Substitution  yields  xT2(l+tT2)  ■ 0.  The  zero  theorem  says  xT2  ■ 

0 or  tf2  ■ -1.  The  former  is  not  true,  because  x < S.l,  so  the  latter  follows. 

Let  t < S.l  be  such  that  tT2  « -1.  This  implies  tt2+l  ■ 0.  Thus,  by  substitution  x - t 
and  y ■ 1 we  find  a pair  for  which  the  sum  of  the  squares  is  equal  to  zero  modulo  p. 

Finally,  we  show  that,  if  there  is  an  element  t < S.l  such  that  tT2  » -1,  then  p - 1 
(me  d 4). 

Proof.  The  set  G - {l,p-l,t,p-t}  is  a subgroup  of  S.l.  'This  is  immediately  clear  if 
we  write  out  a multiplication  table  for  G.  It  is  well-known  that  two  cosets  V « aG  and 
W - bG,  where  a,b  < S.l,  coincide  or  have  no  elements  in  common  [5J.  (Assume  V and 
W have  an  element  c in  common,  so  c * a*g  » bah,  where  g,h  ( G.  If  k is  the  inverse  of 
h,  then  k ( G,  because  G is  a group.  Then  a ■ b»g*k  is  an  element  in  G,  so  aG  - 
b*g*kG  - bG,  so  V and  W coincide.)  Therefore,  the  collection  of  all  the  cosets  of  G 
form  a complete  partioning  of  S.l.  Each  coset  has  the  same  number  of  elements  as  G. 
Hence,  if  subgroup  G has  k distinct  cosets,  k*n(G)  • n(S.l),  where  n(Z)  represents  the 
number  of  elements  of  set  Z.  Since  n(G)  ■ 4 and  n(S.l)  ■ p-1,  we  have  4k  - p-1,  so  p 
» 1 (mod  4). 

Corollary.  If  we  choose  p prime  and  p - -1  (mod  4)  then  not  p-1  (mod  4).  It 
follows  from  the  theorem  that  in  this  case  S.l  does  not  have  an  element  t such  that 
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tT2  • -1,  because  p - 1 (mod  <*}  is  not  true.  This  implies  that  there  is  no  pair  x,y  < 
S.l  such  that  xT2+yT2  ■ 0. 

We  appiy  this  result  to  part  c)  at  the  end  of  Section  3.  For  the  given  prime 
number  p - -1  (mod  4),  there  is  no  pair  of  numbers  x,y  < S.l  such  that  xT2+yT2  ■ 0,  so 
(2i+l)T2  + (2j)T2  • 0,  iff  2i+l  « 2j  ■ 0 

This  is  not  true,  because  2j  < S.l.  Thus,  all  the  numbers  generated  in  the  first  part  of 
the  iteration  are  different  from  all  the  numbers  in  the  second  part.  This  completes 
the  proof  of  assertion  A2  and  also  the  whole  correctness  proof. 


5.  This  correctness  proof  is  a clear  example  the  case  where  the  correctness  of  the 
algorithm  is  much  harder  to  prove  than  that  of  the  implementation.  The  program  is  at 
the  same  time  an  example  of  a useful  abstraction  which  can  be  used,  after  some  minor 
transformations,  in  other  programs.  The  remainder  of  this  paper  shows  how  the 
program  can  be  used  to  obtain  an  efficient  quadratic-hash  algorithm. 

It  is  hard  to  specify  exactly  what  may  be  called  a "minor  transformation"  of  a 
program  whose  correctness  has  been  proved.  The  point  is  that  we  would  like  to 
transform  the  program  into  a usable  form  without  having  to  redo  the  correctness 
proof.  A minor  transformation  of  the  given  program  would  be  the  replacement  of  the 
PRINT-statement  by  the  assignment  A[h]  1,  provided  that  array  A is  properly 
declared  and  initialized.  The  reader  can  imagine  how  that  is  done.  Another  minor 
transformation  would  Le  the  replacement  of  the  for  statement  by  a repeat  and  a while 
statement,  while  adding  the  proper  assignments  to  the  iteration  variable  c.  This 
transformation  results  in 
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begin  integer  c,h,p;  READfp);  READ(h); 
c :«  Oj 

repeat  h (h+c)7.p;  PRINT(h);  c c+1  until  c-(p+l)/2; 
while  c^Odo  c:-c-l;  h:-(h+c)7.p;  PRlNT(h)  od 
end 


A slightly  better  version  is  obtained  by  using  a more  general  form  of  repeat 
statement  with  the  syntax 

repeat  SI*  until  BE  do  S2*  od 

where  S*  is  a sequence  of  zero  or  more  statements  and  BE  a Boolean  expression. 
The  repeat  statement  is  equivalent  to  the  following  sequence  of  ALGOL  60  statements. 

LI:  SI*;  if  BE  then  goto  L2;  S2*;  goto  LI;  L2: 

The  generalized  repeat  statement  first  executes  the  statement  sequence  SI*  and 
subsequently,  as  long  as  BE  is  true,  it  executes  the  sequence  S2*;  SI*.  Note  that  the 
test  precedes  the  repeated  statements  (as  in  a while  statement)  if  SI*  is  empty;  the 
test  is  at  the  end  of  the  repetition  if  $2*  is  empty. 

[I  am  sorry  to  hurt  D.E.  Knuth’s  feelings  by  using  the  closing  delimiter  "od”  [8J.  I 
do  not  share  his  misgivings  towards  the  opening  delimiter  do.  In  the  sequence  do 
begin  I read  begin  as  an  opening  bracket,  but  not  as  a verb  in  the  English  language. 
(Do  all  native  American  programmers  read  begin  as  a verb?  I wonder.)  I started  using 
the  generalized  repeat  statement  2 1/2  years  ago  when  the  discussion  was  going  on 
about  exit  and  leave  statements.  The  fact  that  I have  used  It  ever  since  shows  its 
usefulness  to  me.  See  also  the  appendix.] 
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The  version  in  which  the  generalized  repeat  statement  is  used  reads 


begin  integer  c.h.Pi  PEAD(p);  REA0<h)j 
c 0; 

repeat  h (h+c)2p;  PRINT(h)  until  c - (p-l)/2  do  c c+1  od; 
while  c > 0 do  h:«(h+c)7.p;  PRINT(h);  c:-c-l  od 
end 

In  this  version  h is  printed  for  c - 0,l,...,(p-l)/2  by  the  repeat  statement  and  for  c 
(p-1)/ 2, — , 1 by  the  while  statement.  Note  that  at  the  end  of  the  repeat  statement  c 
is  not  unnecessarily  incremented  to  (p+l)/2. 

6 A quadratic  hash  algorithm  starts  off  with  computing  some  number  hash.O  » H(name) 
as  function  of  the  machine  representation  of  "name".  As  long  as  no  match  or  free  slot 
is  found  in  the  symbol  table  ST,  the  algorithm  computes  successive  numbers  by  the 
rule 

hash.i  - hash.O  ♦ a*iT2  «•  b*i  + c (mod  p) 
where  p is  the  length  of  the  symbol  table. 

The  coefficients  b and  c have  no  impact  on  the  probability  Pr(k)  that  a match  or  an 
empty  slot  is  found  in  precisely  k ■ i probes.  This  has  been  argued  by  C.  E.  Radke 
[6]  and  J.  R.  Bell  [7].  Thus,  the  coefficients  b and  c can  be  arbitrarily  chosen.  We 
choose  c ■ 0.  Coefficient  a must  be  unequal  to  zero,  otherwise  the  hash  algorithm 
degenerates  to  a linear  hash. 

A well-known  problem  of  the  rule  above  is  that  it  generates  only  (p  + l)/2  different 
values  for  all  i < S.O  - (0,1,..., p-1}.  If  we  choose  a - 1,  b - 0,  then  all  possible 
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squares  modulo  p are  generated  by  the  numbers  i < {0,l,...,(p-l)/2},  because  i and  (p-i) 
produce  tr,e  same  square.  (All  the  repetitions  are  generated  by  the  "upper  half" 

{(p+l)/2 P~  1 }•)  The  missing  numbers  are  found  in  the  following  way.  The  set  Q - 

{xT2  (mod  p)|x  < S.l } is  a subgroup  of  S.l.  Let  t < S.l-Q.  The  cu  et  tQ  has  no 
elements  in  common  with  Q (because,  if  it  had,  tQ  and  Q would  coincide  which  implies  t 
< Q).  The  coset  tQ  has  the  same  number  of  distinct  elements  as  the  subgroup  Q, 

(P-D/2.  Thus,  tQ  C Q - S.l,  or  tQ  - S.l-Q  - S.O-Q  u {0}.  This  means  that  tQ  contains 
all  the  missing  numbers. 

If  we  choose  p - -1  (mod  4),  we  Know  that  -1  is  not  a square,  m that  case  the 
remaining  numbers  can  he  found  by  choosing  coefficient  a - -1.  A program  that 
implements  the  quadratic  hash  for  a - (1,-1)  and  b - 0,  is 
integer  procedure  QH(name);  value  name;  integer  name; 

comment  name  is  the  internal  representation  of  an  identifier.  The  procedure  returns 
-1  if  the  Symbol  Table  is  full  and  the  name  is  not  in  the  table,  otherwise  it  returns 
the  index  of  the  ST-entry  that  holds  the  name; 
begin  integer  hash, index, c; 

hash  H(name);  comment  the  initial  hash  function  remains  unspecified  here; 
c 0;  comment  the  constant  p and  the  symbol  ST  are  global  objects; 
repeat  index  (hash+c*o)7.p 

until  ST[index]  - name  or  ST[index]  - 0 or  c - (p-l)/2  do  c c+1  od; 
if  ST[index]  * name  and  ST[index]  a 0 then 
c 1;  index  hash; 

repeat  index  :■  (hash-c*c)7.p;  if  index  < 0 then  index  index+p  fi 
until  STfindex]  “ 0 or  ST[index]  - name  or  c - (p-l)/2  do  c :■  c+1  od 


The  Correctness  Proof  of  a Quadratic-Hash  Algo-ithm 


12 


comment  we  follow  the  ALGOL  68  convention  of  terminating  an  if  statement  by  fij 
if  STfindex]  - 0 then  $T[index]  name  <i; 

QH  if  $T[index]  ■ name  then  index  elce  -1  fi 
end 

One  might  argue  that  the  new  value  of  index  can  be  computed  from  the  old  value 
of  index  by  adding  (or  subtracting)  2c+l.  This  is  true,  but  the  sum  should  be  written 
as  c+c  + 1,  otherwise  the  computation  is  even  longer  than  c*c!  The  variable  index 
cannot  be  deleted,  because  the  original  value  of  hash  is  needed  before  the  second 
iteration  begins. 

We  find  significant  improvements  of  the  quadratic  hash  program  if  we  observe  the 
similarity  between  this  program  and  the  one  whose  correctness  we  proved.  Noticing 
the  similarity  allows  us  to  delete  the  variable  index,  the  conditional  test  in  the  second 
iteration,  and  it  allows  us  to  replace  the  product  c*c  by  the  single  term  c.  The 
resulting  program  for  QH  is 

integer  procedure  QH(name);  value  name;  integer  name; 
begin  integer  hash.c; 
hash  :«  H(name);  c :«  0; 
repeat  hash  (hash+c)7.p 

until  STfhash;  - 0 or  STfhash]  - name  or  c - (p-1  )/2  do  c c+1  od; 
while  STfhash]  « 0 and  STfhash]  « name  and  c > 0 do 
begin  hash  (hash+c)7.p;  c :«  c-1  end; 
if  STfhash]  ■ 0 then  STfhash]  :■  name  fi; 

QH  if  STfhash]  - name  then  hash  elce  -1  fi 
end 
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Conclusion 

The  correctness  proof  cannot  always  be  derived  from  the  program  structure.  It  is 
o'ten  necessary  to  prove  the  correctness  of  an  algorithm  by  pure  mathematical 
methods  or  by  exhaustive  enumeration.  Once  the  correctness  of  the  algorithm  has 
been  established,  the  correctness  proof  of  the  program  amounts  to  proving  that  the 
program  is  a true  implementation  of  the  algorithm. 

A correctness  proof  becomes  intractable  if  the  problem  is  not  split  into  a coherent 
set  of  subproblems  which  are  of  smaller  complexity  than  the  total  problem.  The 
subproblems  are  constructed  by  modeling  a small  aspect  of  the  total  problem  and  by 
abstracting  from  the  total  problem  everything  that  is  irrelevant  to  this  model.  The  art 
of  finding  the  right  abstractions  is  of  greater  importance  than  the  art  of  finding  the 
right  assertions.  If  a programmer  masters  the  former,  he  will  have  no  problem  with 
the  latter,  but  the  opposite  is  not  true. 

There  remains  the  problem  of  composition.  The  programmer  must  make  sure  that 
the  resulting  program  does  not  violate  the  conditions  that  allowed  him  to  prove  the 
correctness  of  the  abstractions.  In  most  cases  this  amounts  to  showing  that  some 
trivial  transformations  of  the  correct  programs  again  result  in  correct  programs.  It  is 
as  yet  not  clear  which  transformations  we  may  consider  as  trivial  in  the  sense  that  the 
correctness  of  the  transformed  program  follows  from  the  correctness  of  the  original 

program. 
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The  Correctness  Proof  of  a Quadratic-Hash  Algorithm 
Appendix 

The  generalized  repeat  statement  used  in  Section  f>  ic  a special  case  of  the 
iterative  statement.  The  various  forms  of  the  iterative  statement  are 


. 


{ for  <var>  in  {-}<range>  } repeat  SI*  until  BE  { do  S*  od  } 

{ for  <var>  in  {-}<range>  } while  BE  do  S*  od 
for  <var>  in  {-}<range>  do  S*  od 

The  { } pair  indicates  optional  parts.  S*  means  a sequence  of  zero  or  more 
statements.  The  iterative  statement  can  be  optionally  preceded  by  the  prefix 
with  <ded>*  }.  This  optional  prefix  allows  the  programmer  to  declare  data  local  to 
the  iterative  statement.  A range  usually  has  the  form 
[<indexexpression>  . . <indexexpression>}  It  may  also  be  an  array  name,  an 
expression  list  or  a range  type  name.  A range  type  is  defined  by  an  enumeration  of 
the  constants  of  th»»  type  (e.g.  rang#  day  - Sunday,  Monday,  Tuesday,  Wednesday, 
Thursday,  Friday,  Saturday  end).  The  optional  minus  sign  in  front  of  the  range 

indicates  whether  the  iterative  variable  should  step  through  the  range  in  reverse 
order. 

The  iterative  variable  in  the  for-clause  must  be  a declared  variable  of  the  same 
type  as  the  range  elements.  No  matter  whether  the  iterative  variable  is  changed  in 
the  sequence  of  statements  controlled  by  the  for-clause,  before  the  next  iteration 
commences,  the  next  value  in  the  range  is  assigned  to  the  iterative  variable.  A range 
defined  as  a pair  of  index  expressions  is  considered  to  be  empty  if  the  expression  left 
of  the  separator  is  greater  than  the  expressions  to  the  right  of  the  separator. 
The  for-statement  is  in  that  case  quivalent  to  an  empty  statement. 
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In  addition  to  these  iterative  statements  we  found  it  useful  to  have  two  iterative 
Boolean  expressions  which  correspond  to  predicates  prefixed  by  a single  logical 
quantor.  The  syntax  of  these  expressions  is 
some  <var>  in  {-}  <range>  sat  BE 
all  <var>  in  {-}  <range>  sat  BE 

The  delimiter  sat  is  an  abbreviation  for  "satisfy"  or  "satisfies".  The  first 
expression  corresponds  to  a predicate  prefixed  by  "there  exists",  the  second  to  a 
predicate  prefixed  by  "for  all".  The  expressions  are  not  independent;  the  one  can  be 
expressed  by  the  negation  of  the  other  if  the  Boolean  expression  BE  is  replaced  by 
not  BE.  However,  it  «s  convenient  to  have  both  and  the  one  is  not  more  difficult  to 
implement  than  the  other.  The  iterative  variable  is  a declared  variable  of  the  same 
type  as  the  range  elements.  If  the  variable  steps  through  the  entire  range,  the  final 
value  is  undefined.  Otherwise,  it  has  the  first  value  which  made  the  first  expression 
true  or  the  second  expression  false. 

Some  examples  of  applications.  Let  a square  matrix  A[0..p,0..p]  be  given.  The 
following  expression  returns  true  if  every  row  has  at  least  one  element  equal  to  zero, 
all  x in  [0..p]  sat  (soma  y in  [0..p]  sat  A[x,y]  - 0) 

Application  is  to  the  quadratic  hash  algorithm  leads  to  the  following  program. 
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integer  procedure  QH(name);  value  name;  integer  name; 
begin  integer  c.hash;  hash  H<name); 

QH 

if  some  c in  [0..(p-l)/2]  sat  ST[hash  (hash+c)2p]  - 0 or  ST[hash]  - name 
then  hash 

else 

if  some  c in  - [Up-l)/2]  sat  ST[hash  (hash+c)7.p]  - 0 or  ST[hash]  - name 
then  hash  else  -1  fi  fi; 
if  ST[hash]  ■ 0 then  ST[hash]  :■  name  fi 
end 


The  last  example  is  a procedure  SPACE  which  searches  through  the  global  bit 
array  6[0..top-l]  for  k contiguous  bits  equal  to  one.  If  found,  it  returns  the  index  of 
the  first  bit  of  this  area  j if  not  found,  the  procedure  returns  -1. 

integer  procedure  SPACE(k);  value  k;  integer  kj 
begin  integer  x,y,z;  x 0;  z k; 
while  z<-top  do 

if  some  y in  -[x..z-l]  sat  B[y]  - 0 
then  x:»z;  z:-y+k+l  else  RETURN(z-k)  fi  odj 
RETURNS- 1) 
end 


fhe  program  is  slightly  faster  if  y steps  through  the  range  in  reverse  order, 
because  in  that  case  y points  the  rightmost  zero  in  the  inspected  area.  Variable  y 
steps  through  the  range  in  reverse  order  if  the  range  expression  is  preceded  by  a 
minus  sign. 
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